Off-Policy Interleaved $Q$-Learning: Optimal Control for Affine Nonlinear Discrete-Time Systems
Similar resources
Optimal Finite-time Control of Positive Linear Discrete-time Systems
This paper considers an optimization problem for linear discrete-time systems in which the closed-loop system must be positive (i.e., all of its state variables take non-negative values) and finite-time stable. To this end, a quadratic cost function is considered and an optimal controller is designed such that, in addition to minimizing the cost function, the positivity proper...
Sensitivity Approach to Optimal Control for Affine Nonlinear Discrete-time Systems
This paper deals with the optimal control problem for a class of affine nonlinear discrete-time systems. By introducing a sensitivity parameter and expanding the system variables into a Maclaurin series around it, we transform the original problem into the optimal control problem for a sequence of linear discrete-time systems. The optima...
Q-learning for Optimal Control of Continuous-time Systems
In this paper, two Q-learning (QL) methods are proposed and their convergence theories are established for addressing the model-free optimal control problem of general nonlinear continuous-time systems. By introducing the Q-function for continuous-time systems, policy iteration based QL (PIQL) and value iteration based QL (VIQL) algorithms are proposed for learning the optimal control policy fr...
Q-learning-based optimal digital feedback control with computation time delay of linear discrete-time systems
Embedded computers introduce delays due to computation time. If these delays are not taken into account, the controlled system may become unstable. When the system is unknown, Q-learning-based optimal control is a useful approach: applying it yields the optimal feedback gain for the unknown system. In this paper, we propose Q-learning-based optimal feedback control taking the delay in...
Integral Q-learning and explorized policy iteration for adaptive optimal control of continuous-time linear systems
This paper proposes integral Q-learning for continuous-time (CT) linear time-invariant (LTI) systems, which solves a linear quadratic regulation (LQR) problem in real time for a given system and a value function, without knowledge of the system dynamics A and B. Here, Q-learning refers to a family of reinforcement learning methods that find the optimal policy by interaction with ...
Journal
Journal title: IEEE Transactions on Neural Networks and Learning Systems
Year: 2019
ISSN: 2162-237X,2162-2388
DOI: 10.1109/tnnls.2018.2861945